Fix Prometheus recording rules to not use legacy metric names. #13718
Signed-off-by: Olivier Wilkinson (reivilibre) <[email protected]>
Daunting but looks sane!
Have you had a chance to test this?
> This is compatible with both 'legacy' Synapse installations and experimental 'non-legacy' Synapse installations.

Because both configurations report the futureproof metrics?

-    "synapse_federation_client_sent_pdu_destinations:count",
+    "synapse_federation_client_sent_pdu_destinations_count",
     "Number of PDUs queued for sending to one or more destinations",
 )

 sent_pdus_destination_dist_total = Counter(
-    "synapse_federation_client_sent_pdu_destinations:total",
+    "synapse_federation_client_sent_pdu_destinations",
     "Total number of PDUs queued for sending across all destinations",
> (N.B. I also fix a metric that I forgot to rename in #13540)

Any way to check if we've missed any others? I guess grep for `"synapse_.*:.*"`?
Should have thought of that; I suppose I never really twigged that they all have a convenient `synapse_` prefix that makes them greppable, instead of searching for 'string with colon in' :-)
Found another pair of sneaky ones ...
> I never really twigged that they all have a convenient synapse_ prefix

I'm just guessing here!!
Could also grep for something like `Counter\(\n?.*".*:` and `Gauge\(\n?.*".*:` if you can get the multiline regex stuff working.
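
The multiline grep idea above can be sketched in Python, where `\s*` in a regex spans newlines without any special flags. This is only an illustration, not the check that was actually run for this PR: the helper name `find_legacy_metric_names` and the sample source are hypothetical, and the pattern assumes the metric name is the first string argument to `Counter(` or `Gauge(` and starts with `synapse_`.

```python
import re
from typing import List

# Matches Counter(...) or Gauge(...) calls whose first string argument
# starts with "synapse_" and contains a colon. The \s* after the opening
# parenthesis allows the name to sit on the following line.
LEGACY_METRIC_RE = re.compile(r'\b(?:Counter|Gauge)\(\s*"(synapse_[^"]*:[^"]*)"')


def find_legacy_metric_names(source: str) -> List[str]:
    """Return the metric names in `source` that still contain a colon."""
    return LEGACY_METRIC_RE.findall(source)


# Hypothetical sample input: one legacy (colon) name and one renamed metric.
sample = '''
sent_pdus_destination_dist_total = Counter(
    "synapse_federation_client_sent_pdu_destinations:total",
    "Total number of PDUs queued for sending across all destinations",
)
sent_pdus_destination_dist_count = Counter(
    "synapse_federation_client_sent_pdu_destinations_count",
    "Number of PDUs queued for sending to one or more destinations",
)
'''

print(find_legacy_metric_names(sample))
```

Running the helper over each `.py` file in the source tree would list any remaining colon-style names; other metric classes could be added to the alternation if needed.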
Conventionally we definitely are meant to use an easily-recognised prefix like `synapse_`. I don't think even we would ignore that...
yep, exactly, we have been doing so for a long time :)
This appears to work on librepush.net.
Thanks!
Synapse 1.68.0 (2022-09-27) =========================== Please note that Synapse will now refuse to start if configured to use a version of SQLite older than 3.27. In addition, please note that installing Synapse from a source checkout now requires a recent Rust compiler. Those using packages will not be affected. On most platforms, installing with `pip install matrix-synapse` will not be affected. See the [upgrade notes](https://matrix-org.github.io/synapse/v1.68/upgrade.html#upgrading-to-v1680). Bugfixes -------- - Fix packaging to include `Cargo.lock` in `sdist`. ([\matrix-org#13909](matrix-org#13909)) Synapse 1.68.0rc2 (2022-09-23) ============================== Bugfixes -------- - Fix building from packaged sdist. Broken in v1.68.0rc1. ([\matrix-org#13866](matrix-org#13866)) Internal Changes ---------------- - Fix the release script not publishing binary wheels. ([\matrix-org#13850](matrix-org#13850)) - Lower minimum supported rustc version to 1.58.1. ([\matrix-org#13857](matrix-org#13857)) - Lock Rust dependencies' versions. ([\matrix-org#13858](matrix-org#13858)) Synapse 1.68.0rc1 (2022-09-20) ============================== Features -------- - Keep track of when we fail to process a pulled event over federation so we can intelligently back off in the future. ([\matrix-org#13589](matrix-org#13589), [\matrix-org#13814](matrix-org#13814)) - Add an [admin API endpoint to fetch messages within a particular window of time](https://matrix-org.github.io/synapse/v1.68/admin_api/rooms.html#room-messages-api). ([\matrix-org#13672](matrix-org#13672)) - Add an [admin API endpoint to find a user based on their external ID in an auth provider](https://matrix-org.github.io/synapse/v1.68/admin_api/user_admin_api.html#find-a-user-based-on-their-id-in-an-auth-provider). ([\matrix-org#13810](matrix-org#13810)) - Cancel the processing of key query requests when they time out. 
([\matrix-org#13680](matrix-org#13680)) - Improve validation of request bodies for the following client-server API endpoints: [`/account/3pid/msisdn/requestToken`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3account3pidmsisdnrequesttoken), [`/org.matrix.msc3720/account_status`](https://github.com/matrix-org/matrix-spec-proposals/blob/babolivier/user_status/proposals/3720-account-status.md#post-_matrixclientv1account_status), [`/account/3pid/add`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3account3pidadd), [`/account/3pid/bind`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3account3pidbind), [`/account/3pid/delete`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3account3piddelete) and [`/account/3pid/unbind`](https://spec.matrix.org/v1.3/client-server-api/#post_matrixclientv3account3pidunbind). ([\matrix-org#13687](matrix-org#13687), [\matrix-org#13736](matrix-org#13736)) - Document the timestamp when a user accepts the consent, if [consent tracking](https://matrix-org.github.io/synapse/latest/consent_tracking.html) is used. ([\matrix-org#13741](matrix-org#13741)) - Add a `listeners[x].request_id_header` configuration option to specify which request header to extract and use as the request ID in order to correlate requests from a reverse proxy. ([\matrix-org#13801](matrix-org#13801)) Bugfixes -------- - Fix a bug introduced in Synapse 1.41.0 where the `/hierarchy` API returned non-standard information (a `room_id` field under each entry in `children_state`). ([\matrix-org#13506](matrix-org#13506)) - Fix a long-standing bug where previously rejected events could end up in room state because they pass auth checks given the current state of the room. ([\matrix-org#13723](matrix-org#13723)) - Fix a long-standing bug where Synapse fails to start if a signing key file contains an empty line. 
([\matrix-org#13738](matrix-org#13738)) - Fix a long-standing bug where Synapse would fail to handle malformed user IDs or room aliases gracefully in certain cases. ([\matrix-org#13746](matrix-org#13746)) - Fix a long-standing bug where device lists would remain cached when remote users left and rejoined the last room shared with the local homeserver. ([\matrix-org#13749](matrix-org#13749), [\matrix-org#13826](matrix-org#13826)) - Fix a long-standing bug that could cause stale caches in some rare cases on the first startup of Synapse with replication. ([\matrix-org#13766](matrix-org#13766)) - Fix a long-standing spec compliance bug where Synapse would accept a trailing slash on the end of `/get_missing_events` federation requests. ([\matrix-org#13789](matrix-org#13789)) - Delete associated data from `event_failed_pull_attempts`, `insertion_events`, `insertion_event_extremities`, `insertion_event_extremities`, `insertion_event_extremities` when purging the room. ([\matrix-org#13825](matrix-org#13825)) Improved Documentation ---------------------- - Note that `libpq` is required on ARM-based Macs. ([\matrix-org#13480](matrix-org#13480)) - Fix a mistake in the config manual introduced in Synapse 1.22.0: the `event_cache_size` _is_ scaled by `caches.global_factor`. ([\matrix-org#13726](matrix-org#13726)) - Fix a typo in the documentation for the login ratelimiting configuration. ([\matrix-org#13727](matrix-org#13727)) - Define Synapse's compatability policy for SQLite versions. ([\matrix-org#13728](matrix-org#13728)) - Add docs for the common fix of deleting the `matrix_synapse.egg-info/` directory for fixing Python dependency problems. ([\matrix-org#13785](matrix-org#13785)) - Update request log format documentation to mention the format used when the authenticated user is controlling another user. ([\matrix-org#13794](matrix-org#13794)) Deprecations and Removals ------------------------- - Synapse will now refuse to start if configured to use SQLite < 3.27. 
([\matrix-org#13760](matrix-org#13760)) - Don't include redundant `prev_state` in new events. Contributed by Denis Kariakin (@dakariakin). ([\matrix-org#13791](matrix-org#13791)) Internal Changes ---------------- - Add a stub Rust crate. ([\matrix-org#12595](matrix-org#12595), [\matrix-org#13734](matrix-org#13734), [\matrix-org#13735](matrix-org#13735), [\matrix-org#13743](matrix-org#13743), [\matrix-org#13763](matrix-org#13763), [\matrix-org#13769](matrix-org#13769), [\matrix-org#13778](matrix-org#13778)) - Bump the minimum dependency of `matrix_common` to 1.3.0 to make use of the `MXCUri` class. Use `MXCUri` to simplify media retention test code. ([\matrix-org#13162](matrix-org#13162)) - Add and populate the `event_stream_ordering` column on the `receipts` table for future optimisation of push action processing. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13703](matrix-org#13703)) - Rename the `EventFormatVersions` enum values so that they line up with room version numbers. ([\matrix-org#13706](matrix-org#13706)) - Update trial old deps CI to use Poetry 1.2.0. ([\matrix-org#13707](matrix-org#13707), [\matrix-org#13725](matrix-org#13725)) - Add experimental configuration option to allow disabling legacy Prometheus metric names. ([\matrix-org#13714](matrix-org#13714), [\matrix-org#13717](matrix-org#13717), [\matrix-org#13718](matrix-org#13718)) - Fix typechecking with latest types-jsonschema. ([\matrix-org#13724](matrix-org#13724)) - Strip number suffix from instance name to consolidate services that traces are spread over. ([\matrix-org#13729](matrix-org#13729)) - Instrument `get_metadata_for_events` for understandable traces in Jaeger. ([\matrix-org#13730](matrix-org#13730)) - Remove old queries to join room memberships to current state events. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13745](matrix-org#13745)) - Avoid raising an error due to malformed user IDs in `get_current_hosts_in_room`. 
Malformed user IDs cannot currently join a room, so this error would not be hit. ([\matrix-org#13748](matrix-org#13748)) - Update the docstrings for `get_users_in_room` and `get_current_hosts_in_room` to explain the impact of partial state. ([\matrix-org#13750](matrix-org#13750)) - Use an additional database query when persisting receipts. ([\matrix-org#13752](matrix-org#13752)) - Preparatory work for storing thread IDs for notifications and receipts. ([\matrix-org#13753](matrix-org#13753)) - Re-type hint some collections as read-only. ([\matrix-org#13754](matrix-org#13754)) - Remove unused Prometheus recording rules from `synapse-v2.rules` and add comments describing where the rest are used. ([\matrix-org#13756](matrix-org#13756)) - Add a check for editable installs if the Rust library needs rebuilding. ([\matrix-org#13759](matrix-org#13759)) - Tag traces with the instance name to be able to easily jump into the right logs and filter traces by instance. ([\matrix-org#13761](matrix-org#13761)) - Concurrently fetch room push actions when calculating badge counts. Contributed by Nick @ Beeper (@Fizzadar). ([\matrix-org#13765](matrix-org#13765)) - Update the script which makes full schema dumps. ([\matrix-org#13770](matrix-org#13770)) - Deduplicate `is_server_notices_room`. ([\matrix-org#13780](matrix-org#13780)) - Simplify the dependency DAG in the tests workflow. ([\matrix-org#13784](matrix-org#13784)) - Remove an old, incorrect migration file. ([\matrix-org#13788](matrix-org#13788)) - Remove unused method in `synapse.api.auth.Auth`. ([\matrix-org#13795](matrix-org#13795)) - Fix a memory leak when running the unit tests. ([\matrix-org#13798](matrix-org#13798)) - Use partial indices on SQLite. ([\matrix-org#13802](matrix-org#13802)) - Check that portdb generates the same postgres schema as that in the source tree. ([\matrix-org#13808](matrix-org#13808)) - Fix Docker build when Rust .so has been built locally first. 
([\matrix-org#13811](matrix-org#13811)) - Complement: Initialise the Postgres database directly inside the target image instead of the base Postgres image to fix building using Buildah. ([\matrix-org#13819](matrix-org#13819)) - Support providing an index predicate clause when doing upserts. ([\matrix-org#13822](matrix-org#13822)) - Minor speedups to linting in CI. ([\matrix-org#13827](matrix-org#13827))
Part of #11106. The Prometheus recording rules are updated to use the futureproof names. This is compatible with both 'legacy' Synapse installations and experimental 'non-legacy' Synapse installations.
The Prometheus v1 recording rules are removed since Prometheus v1 is very old and we don't update that file anymore; it doesn't seem worth keeping.
(N.B. I also fix a metric that I forgot to rename in #13540)